AITopics | comprehensive score

Collaborating Authors

comprehensive score

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

CEFW: A Comprehensive Evaluation Framework for Watermark in Large Language Models

Zhang, Shuhao, Cheng, Bo, Han, Jiale, Chen, Yuli, Wu, Zhixuan, Li, Changbao, Gu, Pingli

arXiv.org Artificial IntelligenceMar-24-2025

Text watermarking provides an effective solution for identifying synthetic text generated by large language models. However, existing techniques often focus on satisfying specific criteria while ignoring other key aspects, lacking a unified evaluation. To fill this gap, we propose the Comprehensive Evaluation Framework for Watermark (CEFW), a unified framework that comprehensively evaluates watermarking methods across five key dimensions: ease of detection, fidelity of text quality, minimal embedding cost, robustness to adversarial attacks, and imperceptibility to prevent imitation or forgery. By assessing watermarks according to all these key criteria, CEFW offers a thorough evaluation of their practicality and effectiveness. Moreover, we introduce a simple and effective watermarking method called Balanced Watermark (BW), which guarantees robustness and imperceptibility through balancing the way watermark information is added. Extensive experiments show that BW outperforms existing methods in overall performance across all evaluation dimensions. We release our code to the community for future research. https://github.com/DrankXs/BalancedWatermark.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2503.20802

Country:

Asia > China > Hong Kong (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)

Add feedback

Research on Effectiveness Evaluation and Optimization of Baseball Teaching Method Based on Machine Learning

Sun, Shaoxuan, Yuan, Jingao, Yang, Yuelin

arXiv.org Artificial IntelligenceNov-24-2024

In modern physical education, data-driven evaluation methods have gradually attracted attention, especially the quantitative prediction of students' sports performance through machine learning model. The purpose of this study is to use a variety of machine learning models to regress and predict students' comprehensive scores in baseball training, so as to evaluate the effectiveness of the current baseball teaching methods and put forward targeted training optimization suggestions. We set up a model and evaluate the performance of students by collecting many characteristics, such as hitting times, running times and batting. The experimental results show that K-Neighbors Regressor and Gradient Boosting Regressor are excellent in comprehensive prediction accuracy and stability, and the R score and error index are significantly better than other models. In addition, through the analysis of feature importance, it is found that cumulative hits and cumulative runs are the key factors affecting students' comprehensive scores. Based on the results of this study, this paper puts forward some suggestions on optimizing training strategies to help students get better performance in baseball training. The results show that the data-driven teaching evaluation method can effectively support physical education and promote personalized and refined teaching plan design.

baseball training, prediction, student, (15 more...)

arXiv.org Artificial Intelligence

2411.15721

Country:

North America > United States > California (0.04)
Europe > Italy > Sicily (0.04)
Asia > China > Shanghai > Shanghai (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Leisure & Entertainment > Sports > Baseball (1.00)
Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Add feedback

Research on Tibetan Tourism Viewpoints information generation system based on LLM

Qi, Jinhu, Yan, Shuai, Zhang, Wentao, Zhang, Yibo, Liu, Zirui, Wang, Ke

arXiv.org Artificial IntelligenceJul-18-2024

Tibet, ensconced within China's territorial expanse, is distinguished by its labyrinthine and heterogeneous topography, a testament to its profound historical heritage, and the cradle of a unique religious ethos. The very essence of these attributes, however, has impeded the advancement of Tibet's tourism service infrastructure, rendering existing smart tourism services inadequate for the region's visitors. This study delves into the ramifications of informational disparities at tourist sites on Tibetan tourism and addresses the challenge of establishing the Large Language Model (LLM) evaluation criteria. It introduces an innovative approach, the DualGen Bridge AI system, employing supervised fine-tuning techniques to bolster model functionality and enhance optimization processes. Furthermore, it pioneers a multi-structured generative results assessment framework. Empirical validation confirms the efficacy of this framework. The study also explores the application of the supervised fine-tuning method within the proprietary DualGen Bridge AI, aimed at refining the generation of tourist site information. The study's findings offer valuable insights for optimizing system performance and provide support and inspiration for the application of LLM technology in Tibet's tourism services and beyond, potentially revolutionizing the smart tourism industry with advanced, tailored information generation capabilities.

fine-tuning, information, viewpoint, (15 more...)

arXiv.org Artificial Intelligence

2407.13561

Country:

Asia > China > Tibet Autonomous Region (0.77)
Asia > China > Sichuan Province > Chengdu (0.06)

Genre: Research Report (0.85)

Industry: Consumer Products & Services > Travel (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

SuperCLUE-Math6: Graded Multi-Step Math Reasoning Benchmark for LLMs in Chinese

Xu, Liang, Xue, Hang, Zhu, Lei, Zhao, Kangkang

arXiv.org Artificial IntelligenceFeb-1-2024

We introduce SuperCLUE-Math6(SC-Math6), a new benchmark dataset to evaluate the mathematical reasoning abilities of Chinese language models. SC-Math6 is designed as an upgraded Chinese version of the GSM8K dataset with enhanced difficulty, diversity, and application scope. It consists of over 2000 mathematical word problems requiring multi-step reasoning and providing natural language solutions. We propose an innovative scheme to quantify the reasoning capability of large models based on performance over problems with different reasoning steps. Experiments on 13 representative Chinese models demonstrate a clear stratification of reasoning levels, with top models like GPT-4 showing superior performance. SC-Math6 fills the gap in Chinese mathematical reasoning benchmarks and provides a comprehensive testbed to advance the intelligence of Chinese language models.

reasoning, red envelope, sc-math6, (12 more...)

arXiv.org Artificial Intelligence

2401.11819

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.37)

Add feedback

Comprehensive Score: Towards Efficient Local Search for SAT with Long Clauses

Cai, Shaowei (Griffith University) | Su, Kaile (Peking University)

AAAI ConferencesAug-3-2013

It is widely acknowledged that stochastic local search (SLS) algorithms can efficiently find models of satisfiable formulae for the Boolean Satisfiability (SAT) problem. There has been much interest in studying SLS algorithms on random $k$-SAT instances. Compared to random 3-SAT instances which have special statistical properties rendering them easy to solve, random $k$-SAT instances with long clauses are similar to structured ones and remain very difficult. This paper is devoted to efficient SLS algorithms for random $k$-SAT instances with long clauses. By combining a novel variable property $subscore$ with the commonly used property $score$, we design a scoring function named {\it comprehensive score}, which is utilized to develop a new SLS algorithm called CScoreSAT. The experiments show that CScoreSAT outperforms state-of-the-art SLS solvers, including the winners of recent SAT competitions, by one to two orders of magnitudes on large random 5-SAT and 7-SAT instances. In addition, CScoreSAT significantly outperforms its competitors on random $k$-SAT instances for each $k=4,5,6,7$ from SAT Challenge 2012, which indicates its robustness.

comprehensive score, efficient local search, long clause

AAAI Conferences

Twenty-Third International Joint Conference on Artificial Intelligence

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.60)

Add feedback